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Abstract 

The focus here is on the early development (mid 1970’s-1980’s) at NASA 
Ames Research Center of implcit methods in Computational Fluid Dynamics 
(CFD). A class of implicit finite difference schemes of the Beam and Warm- 
ing approximate factorization type will be addressed. The emphasis will be 
on the Euler equations. A review of material pertinent to the solution of the 
Euler equations within the framework of implicit methods will be presented. 
The eigensystem of the equations will be used extensively in developing a 
framework for various methods applied to the Euler equations. The devel- 
opment and analysis of various aspects of this class of schemes will be given 
along with the motivations behind many of the choices. Various acceleration 
and efficiency modifications such as matrix reduction, diagonalization and 
flux split schemes will be presented. 


1. Introduction 

The development employed here is based on the implicit approximate 
factorization algorithm of Beam and Warming [1] . A particular application 
in two dimensions was first presented by Steger [2] and for three dimensions 
by Pulliam and Steger [3]. While there have been numerous developments 
and variant implementations of implicit methods over the years, the original 
work of Beam, Warming and Steger stands out as the backbone of modern 
methods. I shall concentrate here on the theoretical development, application 
and assessment of the implicit algorithms of the Beam- Warming- Steger type 
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as they were developed at NASA Ames Research Center in the early 1970’s 
to the 1980’s. 

There are a number of considerations to weigh when choosing a numeri- 
cal algorithm to apply to a set of partial differential equations. If we restrict 
ourselves to finite difference schemes then the possibilities are narrowed some- 
what to the two classical approaches for time integration, explicit and implicit 
techniques. The merits of these have been extensively discussed in the lit- 
erature. Explicit methods typically require less computational work and are 
simpler both in derivation, application and parallelization. Implicit meth- 
ods, while computationally expensive, have less stringent stability bounds 
(classical stability analysis shows unconditional stability but in practice on 
nonlinear problems bounds are encountered). 

Implicit numerical schemes are usually chosen because we wish to obtain 
solutions which require fine grid spacing for numerical resolution, and we do 
not want to limit the time steps by employing a conditionally stable explicit 
scheme. Explicit schemes are very useful and schemes such as MacCormack’s 
explicit algorithm [4] were used extensively in the 1970’s - 1980’s. The extra 
work required for an implicit scheme is usually offset by the advantages ob- 
tained by the increased stability limits, and in general implicit schemes have 
been very useful and successful for a variety of inviscid and viscous flowfield 
calculations. 

For unsteady, transient problems we wish to employ time accurate meth- 
ods, initialize the flow with some realizable state and integrate forward in 
time with time steps commensurate with the unsteady phenomenon being 
simulated. Both implicit and explicit methods are capable of computing time 
accurately. In steady state calculation we wish to integrate from some arbi- 
trary state to the asymptotic solution in any manner which will get us there 
with the least amount of computational work. Non-time-accurate techniques 
(for instance relaxation methods, variable time steps, matrix precondition- 
ing, large time steps) can be employed as long as they are convergent and 
do not distort the steady state equations so as to produce inaccurate results. 
The methods presented below can be employed either for time accurate cal- 
culations or for steady state rapidly convergent solutions. 

The algorithm to be presented is an implicit approximate factorization 
finite difference scheme which can be either Erst or second order accurate in 
time. Local time linearizations are applied to the nonlinear terms and an 
approximate factorization of the two-dimensional implicit operator is used 
to produce locally one- dimensional operators. This results in block tridi- 
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agonal matrices, which are easy to solve. The spatial derivative terms are 
approximated with finite differences of various orders of accuracy, involv- 
ing stencil sizes from 3 to 9 points. Explicit and implicit artificial dissipation 
terms (which will not be discussed here, see Pulliam [5]) are added to achieve 
nonlinear stability. A spatially variable time step is used to accelerate con- 
vergence for steady-state calculations. A diagonal form of the algorithm is 
also discussed, which produces a computationally efficient modification of the 
standard algorithm where the diagonalization results in scalar tridiagonal or 
pentadiagonal operators in place of the block operators. This diagonal form 
of the algorithm produces a robust, rapid and versatile scheme for steady 
state calculations. 

2. Euler Equations 

The Euler equations are comprised of the inviscid compressible continu- 
ity, momentum and the energy equations. The Euler equations are of interest 
for a number of reasons. They are the next step after the potential equation 
in the hierarchy of equations which lead to the full Navier-Stokes equations. 
Besides being valid for use in applications where viscous effect are negligible, 
they are often used in analysis and development of algorithms which even- 
tually are applied to the Navier-Stokes equations. Since the equations are 
capable of capturing, convecting and creating vorticity, they are often used to 
simulate vortical flows where either physical mechanisms (such as shocks) or 
artificial mechanisms (fixed stagnation points, numerical dissipation) account 
for the production of vorticity. In some cases, the resulting flows represent 
acceptable physical solutions and in others the validity of the Euler solution 
is in question relative to a more physical Navier-Stokes equation solution. 

We shall restrict ourselves to the Cartesian form of the two dimensional 
(2-D) equations in strong conservation law form. Strong conservation law 
form is chosen because we wish to admit shock capturing. Typically the 
extension of ideas to three dimensions (3-D) is rather straightforward. It 
is usually a mistake to restrict oneself to just 1-D equations, since ideas 
developed for 1-D often are difficult to extend formally to multidimensions. 
In contrast, the extension from 2-D to 3-D is more easily accomplished. 

2.1. Equations 

The Euler equations in conservation law form are 

dtQ + d x E + d y F = 0 (1) 
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Pressure is related to the conservative flow variables, Q, by the equation 
of state 

P=( 7 “ 1) (e - \p{ u2 + v 2 )^ (3) 


where 7 is the ratio of specific heats, generally taken as 1.4. The speed of 
sound is c which for ideal fluids, c 2 = 7 p/p. 


d t e + d x (puh ) + dy(pvh) = 0 (4) 

2.2. General Form 

First, let us recast Eqs. 1 - 3 in a more general form 


Qi 


(12 


Q3 

<12 

,E = 

Q%/qi+p(q) 

,F = 

q2qs/qi 

Q3 

mz/qi 

ql/qi+p(q) 

. ^ . 


. 92 (94 +p(q)) /qi _ 


_ qsiqi+piq)) M . 


with 

p(9) = (7-i)(94-^(gi + gf)/gi) (6) 

In using Eqs. 5, 6 we will always assume that the variables are linearly 
independent. This is important when we will be examining linearizations 
and Jacobians of the fluxes. 


2.3. Flux Jacobians 

The fluxes, E and F, defined in the previous section are nonlinear func- 
tions of Q. In stability analysis and design of numerical algorithms for the 
Euler equations, the flux Jacobians 
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play a dominant role. 

For example, if we attempt to use the 1 st order implicit scheme (define 

Q n — Q(n ■ At)) 

n n + 1 _ n n 

At + d x E(Q n+1 ) + d y F(Q n+1 ) = 0 (8) 

to integrate Eq. 1, the second and third terms are nonlinear in Q n+1 . We 
can linearize that term to 2 nd order accuracy by a Taylor series expansion 

E n+1 = E n + A n (Q n+1 - Q n ) + 0 (At 2 ) (9) 

and similarly for F n+1 , resulting in 

[/ + A td x A n + A td y B n ] (Q n+1 - Q n ) = -At (, d x E n + d y F n ) 


which is now linear in the solution variable Q n+1 . 

The easiest way to derive the flux Jacobians is to start with the general 
form of the fluxes given in Eq. 5. The elements of A are defined as 


Aij = ^ ( 10 ) 

where the qj are the independent variables. For example, E2 = pu 2 + p = 
q 2 jq\ + p(q) and the element A 2 , 1 is found to be 


_dE 2 _ q 2 (7 - 1) / q\ + q\ 
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with (f) = ( - +l ’ J . 

The Jacobian matrices for the two- dimensional Euler equations are 
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2-4- Homogeneous Property 

The fluxes of the Euler equations have the very interesting and useful 
property of being homogeneous of degree 1, E(sQ) = sE(Q). Since the fluxes 
are homogeneous of degree 1 they can be shown to satisfy E = AQ, F = BQ 
exactly. Beam and Warming took advantage of this in the original develop- 
ment of their implicit approximate factorization algorithm [1], Steger and 
Warming [6] also used this property as an integral part of their development 
of a flux split algorithm. We can use the homogeneous property here to show 
that, for instance, 


dE _ dAQ _ A dQ + dA 
dx dx dx dx ^ 


also 


dE _dEdQ _ A dQ 
dx d Q dx dx 


which implies that 



0 


(14) 


(15) 


(16) 


One can verify this by using Eqs. 2 and 12,13. Similar expressions hold for 
any derivative of E and F. Using such relations we can form the quasilinear 
form of Eq. 1 


dtQ + Ad x Q + Bd y Q — 0 (17) 

2.5. Eigenvector Matrices 

The flux Jacobians A and B each have real eigenvalues and a complete 
set of eigenvectors. Therefore, the Jacobian matrices can be diagonalized. 
Warming, Beam, and Hyett [7] consider a general matrix which is a linear 
combination of A and B, 

A = k x A + K y B (18) 

The diagonalization similarity transformation is 

A k = T~ l AT K (19) 
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where T K is the matrix whose columns are the eigenvectors of A and T K 1 is 
the corresponding left eigenvector matrix. 
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with U = k x u + K y v , 0 2 = 7(7 — l)(w 2 + v 2 ), and (3 = 1/(2 c 2 ), 6 = 

k x u + K y v, and, for example, k x = k x / \J k\ + k 2 . 

We can recover the individual eigenvalue and eigenvector matrices for 
A and B by using k x = 1, n y = 0 for Ta,T^ , Aa and k x = 0, K y = 1 for 
Tb, Tg 1 , A b- An interesting relation exists between Ta and Tb of the form 
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with /i = l/y/2. 

Note that the matrix At is not a function of the flow variables and is in 
fact a constant matrix. 
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It is not possible to simultaneously diagonalize both of the flux Jacobians 
of the Euler equations. It is possible to simultaneously symmetrize the equa- 
tions which is often useful in stability analysis. The eigenvector matrices T n 
and T~ x will diagonalize one and symmetrize the other flux Jacobian matrix 
depending on the choice of k x and K y . 

2.6. Flux Vector Splitting 

In the early years of CFD, a number of schemes were developed based 
on upwind differencing. The flux split scheme of Steger and Warming [6] 
employed a decomposition of the flux vectors in such a way that each element 
can be stably differenced in an upwind fashion. These schemes all claim (with 
good justification) to be physically consistent since they follow in some sense 
the characteristics of the flow. They in general can be shown to produce 
sharp oscillation free shocks without added artificial dissipation. It should 
be noted that these schemes have an inherent amount of internal dissipation, 
due to the one sided differences. 

The plus - minus flux split method of Steger and Warming [6] will be 
used here to introduce the concept of flux splitting. The approach taken is 
to split the eigenvalue matrix A of the flux Jacobians into two matrices, one 
with all positive elements and the other with all negative elements. Then the 
similarity transformations TJ or T Pj are used to form new matrices A + , A~ 
and B + , B~ . Formally, 

A = TaAaT^ 1 = T a {A\ + A a)^ 1 = A+ + A 

with 

* ± = a a ± ]Aa| 

A 2 

Here, |A| implies that we take the absolute values of the elements of A. The 
two matrices, A + and A~ have by construction all positive and all negative 
eigenvalues, respectively. 

New flux vectors can be constructed as 



(25) 


E = AQ = (A + + A~)Q = E + + E~, F = BQ = ( B + + B~)Q = F + + F~( 27) 


The Euler equations can now be written 

d tQ + d x E + + d x E + d y F + + d y F = 0 


(28) 



Different type of spatial differencing can now be used for each of the above 
flux vector derivatives. In the case of the + terms a backward difference in 
space can be used ( forward difference for the — terms), the resulting scheme 
maintains linear stability for the resulting ODE. 

A generalized flux vector can be defined as 

F = T k AT-'Q 

where 

Ai 

A 2 

A3 

A 4 

with A any definition of an eigenvalue. 

For example, we can have X ± from Eq. 26 to get E ± , or we could use A* = 

u : i — 1, 2, 3,4 producing a E u and A.; = 0 : i = 1, 2 with A3 = c, A 4 = — c 

producing a E c . Note that then E = E u + E c . 

The generalized flux vector, see Steger and Warming [6], is written as 

^2(7 - l)Ai + A 3 + 

2(7 — l)Ai« + A 3(11 + ck x ) + A 4 (w — ck x ) 

2(7 - l)Aiu + A 3 (u + cky) + A 4 (u - Cky) 

fi 

where / 4 = (7 - 1)Ai(m 2 + v 2 ) + X 3 [(u + ck x ) 2 + (v + ck y ) 2 }/2 + A 4 [(m - ck x ) 2 + 
{v - cky ) 2 ]/ 2 + (3 - 7)(A 3 + A 4 )c 2 /(2(7 - 1)) and 

Ai = A 2 = K x u + K y V, A 3 = Ai + C^J K-l + Ky, A 4 = Ai - C^J K.I + Ky (32) 
The original E and F can be recovered with the appropriate values of k x and 

Ky. 

There is a very fundamental consideration pertaining to the flux vectors 
defined above. In the case of the ± flux splitting and other splitting which 
are a direct result of the similarity transform, Eq. 29, it is more often the 
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rule than the exception that the flux Jacobians of the resulting flux vectors 
are not equal to the similarity matrices, i.e. 


d F ± 

~dQ 


J tA ± 


(33) 


We shall not write out the exact Jacobians here, but we should note that for 
the ± splitting it has been shown that the eigenvalues of the exact Jacobians, 
while not A ± , are positive and negative appropriately for the + and — fluxes. 


3. Implicit Numerical Algorithms 

3.1. Implicit Time Differencing 

Consider an implicit three point time differencing scheme of the form ( 
Warming and Beam [1]) 


$/\t d 

a< 2 "~s ( a<2 " )+ 
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(d - - - ip) At 2 + At 3 


(34) 


where AQ n = Q n+1 — Q n and Q n = Q(nAt). The parameters tt and p can be 
chosen to produce different schemes of either first or second order accuracy 
in time. 

For i) — 1 and p = 0, we have the first order Euler implicit scheme, and 
for i) — 1 and p = 1/2, the three point implicit scheme. 

Let us restrict ourselves to the first order scheme in time (although all 
of the subsequent development can easily be extended to any second order 
scheme formed from Eq. 34. Applying Eq. 34 with i) — 1 and p = 0 to Eq. 1, 
results in 


Q n+ 1 -Q n + At ( d x E n+l + d y F n+1 ) = 0 (35) 

3.2. Local Time Linearizations 

We wish to solve Eq. 35 for Q n+l given Q n . The flux vectors E and F 
are nonlinear functions of Q and therefore Eq. 35 is nonlinear in Q n+l . The 
nonlinear terms are linearized in time about Q n by a Taylor series such that 


E n+ 1 = E n + A n AQ n + 0(At 2 ) F n+1 = F n + B n AQ n + 0(At 2 )(36) 
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where A = dE/dQ , and B = dF/dQ are the flux Jacobians and A Q n is 
O(At). The Jacobian matrices A and B are griver by Eq. 12, 13. 

Note that the linearizations are second order accurate and so if a second 
order time scheme had been chosen the linearizations would not degrade the 
time accuracy. 

Applying Eqs. 36 to Eq. 35 and combining the A Q n terms produces the 
“delta form” of the algorithm 

[/ + A td x A n + A td y B n ] A Q n = -At (d x E n + d y F n ) (37) 

This is the unfactored form of the block algorithm. We shall call the 
right hand side of Eq. 37 the “explicit” part (RHS) and the left hand side 
the “implicit” part (LHS) of the algorithm. 

3.3. Space Differencing 

The next step is to take the continuous differential operators d x and d y 
and approximate them with finite difference operators on a discrete mesh. 

Introducing a grid of mesh points ( j,k ), variables are defined at mesh 
points as 

u j,k ■= u(j Ax, kAy) (38) 

The choice of the type and order of the spatial differencing is important 
both in terms of accuracy and stability. In most applications second order 
accuracy has proven to be sufficient provided the grid resolution is reasonable. 
The choices for differencing type include central and upwind operators. These 
choices are dictated by stability, accuracy, efficiency and programming issues. 

Second order central difference operators can be used where, for example, 

(^h?+i,fc ^j— 1,&) /(2Ax) , ^y^j,k ( u y,k+i ^j,k— i) /(2A y) (39) 

or upwind operators for the type dependent schemes 

^x^j,k (3uj,fc 4 zij—i k T Uj—2,k) / (2Ax), 

5 b y u jtk = (3 U j)k - 4uy fc _i + My fc_ 2 ) / (2A y) (40) 

with a similar forward differencing forms. 

We will not focus here on the details of the difference schemes, bound- 
ary conditions, general geometry, etc. The pertinent aspect of the finite 
differences is the stencil size and form obtained when a difference scheme is 
chosen. In general, three to seven point operators can be employed, which 
has a direct impact on the bandwidth of the resulting implicit operators. 
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3-4- Matrix Form of Unfactored Algorithm 

We now turn to examining the matrices we get when difference formulas 
are applied to the implicit algorithm. It is always instructive to examine the 
matrix structure of any finite difference equation. With the application of 
central differences to Eq. 37. it is easy to show that the implicit algorithm 
produces a large banded system of algebraic equations. Let the mesh size in 
x be J max and in y be K max . Then the banded matrix is a (J max ' K max • 4) 
x ( Jmax ' Kmax ' 4) square matrix. Let h = At, we have for the LHS matrix 
operator [/ + hS x A n + h5 c y B n ] from Eq. 37, the matrix structure 




Bj,k — 1 
~2A^~ 



Aj — lk 
2Ax 


[i] 


U A i+ i,fc 
n 2Ax 


U Bj,k + 1 

u 2A y 


(41) 


where the variables have been ordered with j running first and then k. 

The matrix is sparse but it would be very expensive (computationally) 
to solve the algebraic system. For instance, for a reasonable two-dimensional 
calculation of transonic flow past an airfoil we could use approximately 300 
points in the x direction and 80 points in the y direction. The resulting 
algebraic system has a 96,000 x 96,000 matrix problem to be solved and 
although we could take advantage of its banded sparse structure it would 
still be very costly in both CPU time and memory. For a banded matrix 
with bandwidth b and rank N, the operation count estimate for inversion 
is 0(b 2 N). In two dimensions, the bandwidth (for Eq. 41) is approximately 
4 x K max (4 for the block size and K max from the differencing in k for each 
j line) for the above ordering. The rank is N = 4 x J m ax x K max , so that 
the total operation count for the inversion is on the order of 0(4 3 x J max x 
K'max)- For the above example grid sizes this gives ~ 9.8 x 10 9 operations per 
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time step. In two dimensions this is not too difficult for today’s large scale 
computer systems. In three dimensions, the block sizes are 5, bandwidth 
is expanded by the extra dimension and the rank is also expanded leading 
to a large operation count which is still beyond the capabilities of modern 
computer systems. 

3. 5. Approximate Factorization 

As we have seen, the integration of the full two-dimensional operator is 
expensive. One way to simplify the solution process is to introduce an approx- 
imate factorization of the two-dimensional operator into two one- dimensional 
operators. The implicit side of Eq. 37 can be written as 

[I + hS x A n + hSy B n ] A Q n = (42) 

[I + hS x A n ] [ I + h8 y B n } AQ n - h 2 8 x A n 8 y B n AQ n 

The cross terms ( h 2 terms) are second order in time since AQ n is 0(h). 
They can therefore be neglected without degrading the time accuracy of any 
second order scheme which we may choose. 

The resulting factored form of the algorithm is 

[. I + h8 x A n } [I + h8 y B n ] A Q n = -h [8 x E n + 8 y F n ] (43) 

We now have two implicit operators each of which is block tridiagonal. 
The structure of the block tridiagonal matrix [I + h8(j.A n ] is 


7 Vu 
" 2 Ax 


[I] 


7 Aj + l.fc 

2Ax 


(44) 


Rewriting Eq. 43 as a two stage process, we have 

Form R n — — h [8 x E n + 8 y F n ] (45) 

For all k lines [I + h8 x A n ] A Q n = R n :: Block tri diagonal solve in j 

For all j lines [I + h8 y B n ) AQ n = AQ n :: Block tri diagonal solve in k 
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The solution algorithm now consists of two one dimensional sweeps, one in 
the x and one in the y direction. The block matrix size is now at most rank 
N = (4 x msLx[J max , K max \) with bandwidth 6 = 8. Each step requires the 
solution of a linear system involving a block tridiagonal which is solved by 
block LU (lower-upper) decomposition. For the above example, there are 
K max block tridiagonal inversions in x and J ma x block tridiagonal inversions 
in y, resulting in 2 x ( J ma x x K max x 4) x 8 2 ~ 12 x 10 6 operations, a decrease 
in work by a factor of about 800. The resulting solution process is much more 
economical than the unfactored algorithm in terms of computer memory and 
CPU time. 

3.6. Diagonal Form 

Most of the computational work for the implicit algorithm is tied to the 
block tridiagonal solution process. The computational work can be decreased 
by introducing a diagonalization of the blocks in the implicit operators as de- 
veloped by Pulliam and Chaussee [8]. The eigensystem of the flux Jacobians 
A and B are used in this construction. 

The eigenvalues and eigenvectors of the flux Jacobians A and B are given 
in Section 2.5 and we have 

A x = T^ 1 AT a and A y = T^ ] BTb (46) 


Here we take the factored algorithm in delta form, Eq. 43 and replace A 
and B with their eigensystem decompositions. 

[Ta TX 1 + hS x ( Ta A x TX 1 )] [T b TX 1 + h S y (T b A y T^ 1 )] A Q n = R n (47) 

At this point Eq. 43 and Eq. 47 are equivalent. A modified form of Eq. 47 
can be obtained by factoring the Ta and T B eigenvector matrices outside the 
spatial derivative terms 5 X and S y . The resulting equations are 

T a [I + h 8 X A x \ N [I + h 5 y A y ] T^AQ n = R n (48) 

where N = Tj'T/j, see Eq. 23. 

The explicit side of the diagonal algorithm (the steady-state finite differ- 
ence equations) is exactly the same as in the original algorithm, Eq. 43. The 
modifications are restricted to the implicit side and so if the diagonal algo- 
rithm converges, the steady-state solution will be identical to one obtained 
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with the unmodified algorithm. In fact, linear stability analysis would show 
that the diagonal algorithm has exactly the same unconditional stability as 
the original algorithm. (This is because the linear stability analysis assumes 
constant coefficients and diagonalizes the blocks to scalars, so the diago- 
nal algorithm then reduces to the unmodified algorithm.) The modification 
(pulling the eigenvector matrices outside the spatial derivatives) of the im- 
plicit operator does affect the time accuracy of the algorithm. The eigenvec- 
tor matrices are functions of x and y and therefore this modification reduces 
the time accuracy to at most first order in time and also gives time accurate 
shock calculations a nonconservative feature, i.e., errors in shock speeds and 
shock jumps. However, the steady-state is fully conservative since the steady- 
state equations are unmodified. Also, computational experiments by Pulliam 
and Chaussee [16] have shown that the convergence and stability limits of 
the diagonal algorithm are similar to that of the unmodified algorithm. 

The diagonal algorithm reduces the block tridiagonal inversion to a set of 
4x4 matrix multiplies and scalar tridiagonal inversions. In two dimension, 
reqrite Eq. 48 


Form 
For all j and k 
For all k lines 
For all j and k 
For all j lines 
For all j and k 


R n = -h [S x E n + S y F n ] 
S n = T^ l R n 
S n =[I + h5 x A x ]- 1 S n 
S n = N~ 1 S n 
S n = [I + hS y A y ]~ 1 S n 
A Q n = T B R n 


(49) 

4x4 matrix multiples 
3 scalar tridiagonal solves 
4x4 matrix multiples 
3 scalar tridiagonal solves 
4x4 matrix multiples 


Pulliam and Chaussee [8] showed that the operation count per grid point 
associated with the implicit side of the full block algorithm is 410 multiplies, 
356 adds, and 10 divides, a total of 776 operations, while the diagonal al- 
gorithm requires 233 multiplies, 125 adds, and 26 divides or 384 operations. 
Adding in the explicit side and other overhead such as I/O (input/output) 
and initialization, the overall savings in computational work can be as high 
as 40%. Note the the computational work was further decreased by noting 
that the first two eigenvalues of the system are identical, Eq. 20. This allows 
us to combine the coefficient calculations and part of the inversion work for 
the first two scalar operators. Another advantage of the approximate factor- 
ization is that the solution process is easily parallelized. One can parallelize 
in y as you invert in x and parallelize in x for the y inversions. 
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3.7. Flux Split Forms 

Flux split forms of the implicit algorithms have been exploited in many 
ways. Starting with Steger- Warming [6] flux vector splitting and the equa- 
tions given in Section 2.6 we have 

d t Q + 5 b x E + + 5 f x E~ + 5 b F + + 5 f y F- = 0 (50) 

where the appropriate upwind differences are used. A first order implicit 
delta form for Eq. 50 is 

[/ + hS b (A+) n + hS'(A-) n + hS\(B + T + hS f t (B~) n ] A Q" = (51) 

-ft K(£T + H(E~r + K( F+ ) U + ■Wt) 

Applying Erst, second or higher order upwind differencing would produce 
a sparse large bandwidth system similar to the unfactored block implicit 
scheme, Eq. 41 and work estimates which are excessive, especially in three 
dimensions. Approximate factorization similar to Section 3.5 would produce 


[/ + hS^A+r + hS f x (A-) n ] [/ + h5 b p (B+) n + ] A Q n = (52) 

-a (s b (E+r + + ^( f +)" + sf(F-r) 

which produces work estimates similar to the approximate factorization of 
that Section. 

An alternative factorization would be to keep the positive terms together 
and the negative terms together producing 

[/ + hS b x (A + ) n + A^B+r] [/ + hSpA-y + ft 6'(B-) n ] A Q" = (53) 

-ft (K(E + )" + 4(E-) n + ftS(F + ) n + H(F-r) 

Now each implicit operators is block tridiagonal, e.g. the structure of 

[/ + hS b (A+r + ftft‘(B + )“] 
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is a lower block diagonal matrix 



Similarly, the forward operators produce an upper block diagonal matrix. 
The advantage of this factorization is that each matrix operator can be in- 
verted by a forward and then a backward sweep process, e.g. a Lower-Upper 
(LU) factorization solution process. The disadvantage of the LU schemes is 
encountered when parallelization is attempted and the natural recursion for 
the forward and backward sweeps preclude an easy parallel implementation. 

4. Summary 

The implicit schemes developed in the 1970’s and 1980’s by Richard 
Beam, Robert Warming and Joseph Steger at NASA Ames Research Center 
are the basic building blocks for a large segment of the CFD codes in use 
today. The advantages of approximate factorization are realized every day 
by modern codes in both the structured arena (e.g. OVERFLOW [9]) and 
even in the unstructured world where LU factorizations are frequently em- 
ployed. We have not focused here on the analysis and application of these 
techniques, that work can be seen in the vast literature where the methods 
described here are widely employed. One just has to pick up any textbook 
which addresses CFD or any relevant journal to see numerous examples of 
the use of these pioneering concepts. 
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